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phenomenology, especially at the LHC. The uncertainties of the parton distribu- 
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1. Introduction 

Global QCD analysis of the parton structure of the nucleon has made significant 
progress in recent years. However, there remain many gaps in our knowledge of the 
parton distribution functions (PDFs), especially with regard to the strange, charm 
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and bottom degrees of freedom. The uncertainties of the PDFs remain large in both 
the very small-x and the large-x regions — in general for all flavors, but particularly for 
the gluon. Uncertainties due to the input PDFs will be one of the dominant sources 
of uncertainty in some precision measurements (such as the W mass) [1], as well as in 
studying signals and backgrounds for New Physics searches at the Fermilab Tevatron 
and the CERN Large Hadron Collider (LHC). Thus, improving the accuracy of the 
global QCD analysis of PDFs is a high priority task for High Energy Physics. 

With the accumulation of extensive precision deep inelastic scattering (DIS) cross 
section measurements of both the neutral current (NC) and charged current (CC) 
processes at HERA I (and even more precise data from HERA II to come soon), it 
is necessary to employ reliable theoretical calculations that match the accuracy of 
the best data in the global analysis. In the perturbative QCD framework (PQCD), 
this requires, among other things, a proper treatment of the heavy quark mass pa- 
rameters.^ There are many aspects to a proper treatment of general mass effects, 
involving both dynamics (consistent factorization in PQCD, with quark massess) and 
kinematics (physical phase space constraints with heavy flavor masses that are not 
satisfied by the simplest implementation of the zero-mass parton model formula). 
Some aspects of these considerations have been applied in existing work on global 
QCD analysis; however, in most cases, not all relevant effects have been consistently 
taken into account. 

In this paper, we present a systematic discussion of the relevant physics issues, 
and provide a new implementation of the general formalism for all DIS processes in 
a unified framework (Sec. §). We show the magnitude of the mass effects, compared 
to the conventional zero- mass (ZM) parton formalism (Sec.^. We then apply this 
simple implementation of the general mass formalism to a precise global analysis 
of PDFs, including the full HERA I cross section data sets for both NC and CC 
processes, and taking into account all available correlated systematic errors (Sec.^). 
We investigate in some depth the parametrization dependence of the global analysis, 
assess the uncertainties of the PDFs in the new analysis, and compare the new 
results with those of CTEQ6.1 [2] (which was based on the ZM parton formalism) 
and other current PDFs [3] (SecH). The results, designated as CTEQ6.5 PDFs, have 
important implications for hadron collider phenomenology at the Tevatron and the 
LHC. Finally, we summarize our main results, discuss their limitations, and mention 
the challenges ahead (Sec.0). Some preliminary results of this investigation have 
been presented at the DIS2006 Workshop [4]. 



-'^Our discussions will be independent of the flavor of the heavy quark. In practice, "heavy 
quarks" means charm and bottom. The top quark is so heavy that it can generally be treated as a 
heavy particle, not a parton. 
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2. General Mass PQCD: 

Formalism and Implementation 



The quark-parton picture is based on the factorization theorem of PQCD. The con- 
ventional proof of the factorization theorem proceeds from the zero-mass hmit for all 
the partons — a good approximation at energy scales (generically designated by Q) 
far above all quark mass thresholds (designated by Mj). This clearly does not hold 
when Q/Mi is of order 1.^ It has been recognized since the mid-1980's that a consis- 
tent treatment of heavy quarks in PQCD over the full energy range from Q < Mj to 
Q ^ Mi can be formulated [5]. (This is most clearly seen in the CWZ renormaliza- 
tion scheme [6].) The basic physics ideas were further developed in [7]; this approach 
has become generally known as the ACOT scheme. In 1998, Collins gave a general 
proof (order-by-order to all orders of perturbation theory) of the factorization theo- 
rem that is valid for non-zero quark masses [8]. The resulting theoretical framework 
is conceptually simple: it represents a straightforward generalization of the conven- 
tional zero-mass (ZM) modified minimal subtraction (MS) formalism. This general 
mass (GM) formalism is what we shall adopt. 

The implementation of the general mass formalism requires attention to a num- 
ber of details, both kinematical and dynamical, that can affect the accuracy of the 
calculation. Physical considerations are important to ensure that the right choices 
are made between perturbatively equivalent alternatives that may produce noticeable 
differences in practical applications. We now systematically describe these consider- 
ations, and spell out the specifics of the new implementation used in our study. For 
simplicity, we shall often focus on the charm quark, and consider the relevant issues 
relating to the calculation of structure functions at a renormalization and factoriza- 
tion scale fi (usually chosen to be equal to Q) in the neighborhood of the charm mass 
Mc- The same considerations apply to the other heavy quarks, and to the calculation 
of cross sections. 

2.1 The Factorization Formula 

Let the total inclusive differential cross section for a general DIS scattering process 
be written as 



where F^{x, Q) are structure functions representing the forward Compton amplitudes 
for the exchanged vector bosons on the target nucleon, Lx{x,y) are kinematic fac- 
tors originating from the (calculable) lepton scattering vertices, and N is an overall 

^Heavy quarks, by definition, have Mj 3> A.qcd- Hence we always assume Q,Mi ^ ^qcd- 




(2.1) 
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factor dependent on the particular process.^ We follow the notation of Ref. [9], from 
which detailed formulas for the above factors can be found. The PQCD factorization 
theorem for the structure functions has the general form 



(2.2) 



Here, the summation is over the active parton flavor label a, /"(x, /x) are the parton 
distributions at the factorization scale /i, are the Wilson coefficients (or hard- 
scattering amplitudes) that can be calculated order-by-order in perturbation theory, 
and we have implicitly set the renormalization and factorization scales to be the same. 
The lower limit of the convolution integral ( is usually taken to be x = Q^/2g ■ p — 
the Bjorken x — in the conventional ZM formalism; but this choice needs to be re- 
considered when the Wilson coefficients include heavy quark mass effects, as we shall 
do in Sec.p]^ below. In most applications, it is convenient to choose fi = Q; but 



there are circumstances in which a different choice becomes useful. For DIS at order 
as and beyond, the results are known to be quite insensitive to the choice of fi. 

2.2 The (Scheme-dependent) Parton Distributions and Summation over 
Parton Flavors 



The summation over "parton flavor" label a in the factorization formula, Eq. ( |2.2|) , 
is determined by the factorization scheme chosen to define the Parton Distributions 

In the fixed flavor number scheme (FFNS), one sums over a = g,u,u,d,d, ... up 
to Uf flavors of quarks, where Uf is held at a flxed value (3,4, ...). For a given uj 
(say Uf = 3), the n/-FFNS has only a limited range of applicability, since, at order 
m of the perturbative expansion, the Wilson coefficients contain logarithm terms of 
the form a^ln'"(Q/Mj) for i > Uf, which is not infrared safe as Q becomes large 
compared to the renormalized mass of the heavy quark Mj (e.g. M^^h, in the Uf = 3 
example). 

The more general variable flavor number scheme (VFNS), as defined in Collins' 
general framework [5, 8], is really a composite scheme: it consists of a series of FFNS's 
matched at conveniently chosen match points fii, one for each of the heavy quark 
thresholds. At /ij, the nj-flavor scheme is matched to the {uf + l)-flavor scheme by 
a set of perturbatively calculable flnite renormalizations of the coupling parameter 



^The summation index A can represent either the conventional tensor indices {1,2,3} or the 
hehcity labels {Right, Left, Longitudinal}. In the zero-mass case, these are the only independent 
structure functions. However, in the general mass case, there are five independent hadronic structure 
functions. In addition to the usual two parity (and chirality) conserving, say F2 and Fiong, and one 
parity violating, F3, structure functions, there are also two chirality-violating amplitudes (one each 
for F2 and Fiong)- These are proportional to gagL, where gR^L are the electroweak couplings of the 
vector boson to the quark line to which it is attached. [7] 
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the mass parameters {Mj}, and the parton distribution functions {/"}. The 
matching scale /Zj can, in principle, be chosen to be any value, as long as it is of 
the order of Mj. In practice, it is usually chosen to be /ij = Mj, since it has been 
known that, in the commonly used (VFNS) MS scheme, all the renormalized quan- 
tities mentioned above are continuous at this point up to NLO in the perturbative 
expansion [5]. Normally, when the VFNS is applied at a factorization scale /i that 
lies in the interval /i„^+i), the n/-fiavor scheme is used. Thus the number of 
active parton flavors depends on the scale /i — henceforth denoted by nf{fi) — and it 
increases by one when /i crosses the threshold /Xj+i from below. ^ 

PQCD in the VFNS is free of large logarithms of the kind mentioned above 
for the FFNS — it is infrared safe, and hence remains reliable, at all scales fi (~ Q) 
^ ^QCD- In this scheme, the range of summation over "a" in the factorization 



formula, Eq. ( |2.2| ), is 0, 1, ...,nf{fi), where represents the gluon, 1 represents u and-u, 
etc. 

Our implementation of the general mass formalism includes both FFNS and 
VFNS. In practice, however, for reasons already mentioned, we shall mostly use the 
VFNS. The definition of parton distributions in the scheme described above is exactly 
the same as that of previous CTEQ PDFs, all being based on [5] . (It is also shared, 
essentially, by most global analysis groups, e.g. MRST [10].) The improvements of the 
formalism over previous analyses reside in the consistent and systematic treatment 
of mass effects in the Wilson coefficients and in the phase space integral of Eq. (|2.2|) 
that we shall now describe and clarify. 

2.3 The Summation over (Physical) Final-state Flavors 

For total inclusive structure functions, the factorization formula, Eq. ( p.2|) , contains 
an implicit summation over all possible quark flavors in the final state. One can 
write, 

^a = J2^'a (2.3) 

b 

where "6" denotes final state fiavors, and o)^ is the perturbatively calculable hard 
cross section for an incoming parton "a" to produce a final state containing fiavor 
"6 " . (Cf. the Feynman diagrams contributing to the calculation of the hard cross 
section in Sec.^]5| below. A caveat on the definition of o)^, is described in Sec, p^.) 

It is important to emphasize that "6" labels quark fiavors that can be produced 
physically in the final state; it is not a parton label in the sense of initial-state parton 

■^Since, once properly matched, all the component FFNS's are well defined, and can co-exist 
at any scale, it is possible to choose the transition from the n/- to the {uf + l)-flavor scheme 
at a transition scale that is different from the matching scale fXi. It is, arguably, even desirable 
to make the transition at a scale that is several times the natural matching scale = Mi, since, 
physically, a heavy quark behaves like a parton only at scales reasonably large compared to its mass 



parameter. But, with the rescaling prescription that we shall adopt in Sec. 2.4, the same purpose 
can be achieved with less technical complications. 
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flavors described in the previous subsection. The latter (labelled a) is a theoretical 
construct and scheme-dependent (e.g. it is fixed at three for the 3-fiavor scheme); 
whereas the final-state sum (over b) is over all flavors that can be physically pro- 
duced. The initial state parton "a" does not have to be on the mass-shell. But the 
final state particles "6" should be on-mass-shell in order to satisfy the correct kine- 
matic constraints and yield physically meaningful results.^ Thus, in implementing 
the summation over final states, the most relevant scale is W — the CM energy of the 
virtual Compton process — in contrast to the scale Q that controls the initial state 
summation over parton flavors (see next subsection). 

The distinction between the two summations is absent in the simplest implemen- 
tation of the conventional (i.e., textbook) zero-mass parton formalism: if all quark 
masses are set to zero to begin with, then all flavors can be produced in the final 
state. This distinction becomes blurred in a zero-mass (ZM) VFNS — the one com- 
monly used in the literature (including previous CTEQ analyses) — where the number 
of effective parton fiavors is incremented as the scale parameter crosses a heavy 
quark threshold, but other kinematic and dynamic mass effects are omitted. Thus, 
the implementation of the ZM VFNS by different groups can be different, depending 
on how the final-state summation is carried out. This detail is usually not spelled 
out in the relevant papers. 

It should be obvious that, in a proper implementation of the general mass (GM) 
formalism, the distinction between the initial-state and final-state summation must 
be unambiguously, and correctly, observed. For instance, even in the 3-flavor regime 
(when c and b quarks are not counted as partons), the charm and bottom fiavors still 
need to be counted in the final state — at LO via + d/ s ^ c or W~ + u/c ^ b, 
and at NLO via the gluon- fusion processes such as W'^ + g — > s + cor 'y + g — > cc (66), 
provided there is enough CM energy to produce these particles. 

This issue immediately suggests that one must also give careful consideration 
to the proper treatment of the integral over the final-state phase space and other 
kinematical effects in the problem. 

2.4 Kinematic Constraints and Rescaling 

Once mass effects are taken into account, kinematic constraints have a significant 
impact on the numerical results of the calculation; in fact, they represent the dom- 
inant factor in the threshold regions of the phase space. In DIS, with heavy fiavor 
produced in the final state, the simplest kinematic constraint that comes to mind is 



^Strict kinematics would require putting the produced heavy flavor mesons or baryons on the 
mass shell. In the PQCD formalism, wc adopt the approximation of using on-shell final state heavy 
quarks in the underlying partonic process. 




(2.4) 
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where W is the CM energy of the vector-boson-nucleon scattering process, Mn is the 
nucleon mass, and the right-hand side is the sum of all masses in the final state. Since 
W is related to the familiar kinematic variables [x, Q) by — = Q'^{1 — x)/x, 
this constraint can be imposed by a step function 6{W — M^v — Ylf^f) condition 
on the right-hand side of Eq. ( |2.2|) , irrespective of how, or whether, mass effects 
are incorporated in the convolution integral. Although that simple approach would 
represent an improvement over ignoring the kinematic constraint Eq. ( p.4| ), it is too 
crude, and can lead to undesirable discontinuities. 

A much better physically motivated approach is based on the idea of rescaling. 
The simplest example is given by charm production in the LO CC process + s c. 
It is well-known that, when the final state charm quark is put on the mass shell, 
the appropriate momentum fraction variable for the incoming strange parton, x iii 
Eq. ( |2.2|) , is not the Bjorken x, but rather x = + M^/Q"^) [11]. This is commonly 
called the rescaling variable. 

The generalization of this idea to the more prevalent case of NC processes, say 
■j/Z + c c {or any other heavy quark), took a long time to emerge [12], because this 
partonic process implies the existence of a hidden heavy particle — the c — in the target 
fragment. The key observation was, heavy objects buried in the target fragment are 
still a part of the final state, hence must be included in the phase space constraint, 
Eq. ( |2.4|) . Taking this effect into account, and expanding to the more general case 
of •y/Z + c a + X, where X contains only light particles, it was proposed that 



the convolution integral in Eq. (2^) should be over the momentum fraction range 
Xc < ^ < 1, where 

In the most general case where there are any number of heavy particles in the final 
state, the corresponding variable is (cf. Eq. ( p.4|) ) 



Xc = x(l + -f ) . (2.5) 



X = x(l + ^^^^j . (2.6) 

This rescaling prescription has been referred to as ACOTx in the recent literature 
[10, 12]. 

Fig.|l| helps to visualize the physical effects of rescaling for charm production in 
NC DIS. In this plot, we show constant x and constant Xc lines. The threshold 
for producing charm corresponds to the line W = 2Mc (lower right corner), which 
coincides with Xc = ^- Far above threshold, Xc — x. Close to the threshold, Xc can 
be substantially larger than x. For fixed (and sufficiently large) x, as Q increases 
(along a vertical line upward in the plot, such as a; = 0.1), the threshold is crossed at 
= 4M^a;/(l — x) (point C on the plot), beyond which Xc decreases, approaching 
X asymptotically (point D on the plot). For fixed Q, as x decreases from 1 (along a 
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Figure 1: Kinematics of rescaling due to Mc in the (x, Q) plane for NC DIS. The curves are 
constant x {Xc) hues. They asymptotically approach the corresponding Bjorken x values 
(vertical lines). The solid diagonal straight line marks the kinematic reach of HERA. 

horizontal line to the left), the threshold is crossed at x = (1 + 4M^/ Q"^) ^, below 
which Xc is shifted relative to x according to Eq. ( |2.5| ) or (^.61). 

Rescaling shifts the momentum variable in the parton distribution function 
/'^(^,/i) in Eq. (|2.2|) to a higher value than in the zero-mass case. For instance, 
at LO, the structure functions at a given point A are proportional to /(a;, Q) in the 
ZM formalism; but, with ACOTx rescaling, this becomes /(Xc, Q)- The shift x ^ Xc 
is equivalent to moving point A to point B in Fig. |l]. 

In the region where (SjMj)^/(5^ is not too small, especially when /(^,/i) is a 
steep function of ^, this rescaling can substantially change the numerical result of 
the calculation. It is straightforward to show that, when one approaches a given 
threshold (Mat + Mj) from above, the corresponding rescaling variable x 1- 
Since generally f'^{S,-,^Ji) — > as ^ — 1, rescaling ensures a smoothly vanishing 
threshold behavior for the contribution of the heavy quark production term to all 
structure functions. This results in a universal^, and intuitively physical, realization 
of the threshold kinematic constraint for all heavy flavor production processes. 

2.5 Hard Scattering Amplitudes and the SACOT Scheme 

The last quantity in the general formula Eq. (p.2| ) that we need to discuss is the 

^ Since it is imposed on the (universal) parton distribution function part of the factorization 
formula. 
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hard scattering amplitude (^x, ^, ^^(/i)^. These amphtudes are perturbatively 
calculable. To facilitate the discussion, consider the special case of charm production 
in a neutral current process. At LO and NLO, the Feynman diagrams that contain 
at least one heavy quark (c or c) in the final state are depicted in Fig. ^ for both the 
3-flavor (lower) and 4-flavor (upper) schemes. 



1^ 



4-flavor 
scheme 



3-flavor 
scheme 



O(a') 



"T" 



WVAA/i 




^AAAA/ 



Subtraction- 
terms 



Mc T 1 



Figure 2: Partonic processes included in the theoretical nj-flavor scheme calculations 
(nj = 3,4) at order and a^. Generalization to higher orders is straightforward. 

For /i < Mc, the 3-fiavor scheme applies. In this scheme, there is no charm par- 
ton in the initial state. The only diagram contributing to charm production at order 
as is the gluon fusion diagram. If Mc is kept as nonzero, the hard scattering ampli- 
tude is finite; and the calculation is relatively straightforward. The hard scattering 
amplitude depends only on {x,Q/Mc), not on the factorization scale fi. 

For /i > Mc, the 4- flavor scheme apphes. The LO subprocess 'j/Z + c c, 
with a charm parton in the initial state, is of order a^. It represents the resummed 
result of collinear singularities of the form a" ln"'(/i/Mc), n = 1,2, from Feynman 
diagrams of all orders in n. Since the asln(/i/Mc) terms due to the NLO diagrams 
shown in Fig. ^ are already included in the resummation, they must be subtracted 
to avoid double-counting. These are denoted by the "subtraction terms" in Fig. ^ 
The subtraction term associated with the gluon-initiated diagram is of the form 
as ln(/i/Mc)a;°(Mc) J^{d^/0 di^^ lAPgqix/C) where a;°(Mc) is the LO hard scattering 
amplitude and Pgq is the g q splitting function of QCD evolution at order as- The 
specific choices adopted for x, fi, and the Mc dependence of u^{Mc) together precisely 
define the chosen factorization scheme. To be consistent, the same prescription must 
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be used in evaluating the LO term, which takes the form c{x, /i) uj^{Mc), wherec(x, 1^) 
is the charm parton distribution (cf. Eq. (|A.6| ) in Appendix 0). 

The relation between the choice of x (in the form of the rescaling variable Xc) 
and the proper treatment of kinematics was discussed in the previous subsection 
(Cf. also Appendix 0). The freedom associated with the choice of the Mc dependence 
of the hard scattering amplitudes was discussed in Refs. [8, 13]. The simplest choice 
that retains full accuracy can be stated succinctly as: keep the heavy quark mass 
dependence in the Wilson coefficients for partonic subprocesses with only light initial 
state partons {g,u,d,s); but use the zero-mass Wilson coefficients for subprocesses 
that have an initial state heavy quark (c, 6). This is known as the SACOT scheme. 
For the 4- flavor scheme to order (NLO), this calculational scheme entails: (a) 
keep the full Mc dependence of the gluon fusion subprocess; (b) for NC scattering 
(7/Z exchanges), set all quark masses to zero in the quark-initiated subprocesses; 
and (c) for CC scattering {W± exchange), set the initial-state quark masses to zero, 
but keep the final-state quark masses on shell (Cf. [7, 13]). 



2.6 Choice of Factorization Scale 

The final choice that has to be made is that of factorization scale /i, which connects 
the (soft) parton distributions and the hard scattering amplitude. Provided the 
matching between the LO and the subtraction terms are correctly implemented as 



described above (Sec.|2]^), the difference due to different choices of fi is formally of 
one order higher than that of the perturbative calculation — as long as fi is of the 
same order of magnitude as the physical hard scattering scale, say Q. However, in 
the threshold region, the scale dependence can be quite sensitive to the treatment of 
kinematics because of heavy quark mass effects. On the other hand, it was shown 
in Ref. [12] that, once the kinematics are handled correctly according to the ACOTx 
prescription (Sees. 2.2- |2.4|) , the /i dependence of the overall calculation becomes very 



mild. 

In the conventional ZM formalism, the natural choice of the hard scale (the 
typical virtuality) for the DIS process is Q. Hence fi = Q is almost universally used 
in all practical calculations. In the CM formalism, we should re-examine the possible 
choices. 

The total inclusive structure function is infrared safe. Consider the simple 
case of just one effective heavy flavor, charm (i.e. below the bottom and top produc- 
tion thresholds), 

ptot ^ pHght ^ pc ^ (2.7) 

for any given flavor-number scheme (i.e. 3-flavor, 4-flavor, ... etc.). If we use the 
same factorization scale for both terms, then the sum is insensitive to the value of 
/i — the logarithmic /x-dependence of the individual terms cancel each other. Since 
the right-hand side of Eq. ( p.7| ) is dominated by the light-flavor term p^.^^ht^ ^-^^ 
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natural choice of scale for this term is jj = Q, it is reasonable to use this choice 
for both terms. This turns out to be a good choice in practice as well, since the 
resulting F/"* is then continuous across the boundary separating the 3-flavor region 
(/i < Mc) from the 4-flavor region {fi > M^) — the line Q = in Fig. |l[ 

Experimentally, the semi-inclusive DIS structure functions for producing a charm 
particle in the final state is often presented. Unfortunately, theoretically, F^{x, Q, Mc) 
is not infrared safe beyond NLO. One may nonetheless perform comparison of NLO 
theory with experiment with the understanding that the results are intrinsically less 
reliable, and they can be sensitive to the choice of parameters. The analytic expres- 
sions for in PQCD suggest that the typical virtuality for this process is ^/Q^ + 
instead of Q. For the factorization scale in this case, the choice /i = ^Q^ + ap- 
pears to be natural. This choice has the added advantage that /i > for all 
physical values of Q; hence, in practice, with this choice, one stays always in the 4- 
flavor regime, avoiding the need to make a transition from 3- to 4-flavor calculations 
when Q crosses the value Mc, cf. Fig. ^7 

3. Differences between ZM and GM calculations 

The GM version of the n-fiavor scheme calculation reduces to the conventional ZM 
one when the hard scale Q is much larger than the quark mass Mn- Thus differences 
between the two schemes are only expected to be noticeable in the Q ~ M„ region. 
Similarly, the differences between the GM and ZM versions of the VFNS should occur 
mostly around the charm, bottom and top threshold regions of the (x, Q) plane. 

In general, among the various mass effects described in the previous section, the 
most significant one numerically is that due to rescaling, /(x, fi) — > /(x, /i)- The size 
of this effect depends on: (i) the size of the shift x — > xi and (ii) the rate of change 
of f{x, /i) at the relevant value of x. As can be seen from Fig.|l|, the size of the shift 
X — >■ X is largest when Q ~ M„. According to (ii) above, however, this effect will 
be significant only when f{x,fi) is large and rapidly varying in x. As we shall see 
below, this effect shows up most prominantly at small x. 

In the left panel of Fig. |^, we show the fractional differences between the GM and 
ZM calculations for -F^(x, Q) over the (x, Q) plane. The magnitude of the fractional 
difference (in percentage) is represented by the color coding shown along the right 
vertical axis, and by the dashed contour lines. The light solid curves are constant x 
lines, taking into account the c and b quark masses. The kinematic boundary (blue 
line) corresponds to the HERA energy reach. We see that the expectations of the 
previous paragraph are borne out. 

^There is no physical significance to the transition of Q across the value Mc- The physical 
threshold for producing charm is at = 2Mc. The ACOTx prescription ensures continuity across 
this threshold. 
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Figure 3: Fractional differences in percentages between the general-mass (GM) and zero- 
mass (ZM) calculations of F2 (left plot) and (right plot), represented by color coding 
marked along the right-side vertical axis and by the dashed contour curves. 

Mass effects can be more readily seen for physical quantities that vanish in 
the ZM limit. The most obvious example is the longitudinal structure function in 
DIS, Fl{x,Q), which vanishes at LO in the ZM formalism. It is therefore useful to 
examine the importance of mass effects in Fi quantitatively. In the right panel of 
Fig.^, we show the fractional differences between the GM and ZM calculations for 
F2{x,Q) over the {x,Q) plane. Compared to the F2 case, we see that, in addition 
to the effects of rescaling, the mass effects in the hard scattering is quite prominent. 
(Notice the different vertical scales of the two plots.) We see that the differences are 
more noticeable and spread wider in the charm and bottom threshold region than 
for the F2 case, due to the additional mass effects in the hard scattering amplitude. 

The differences between the GM and ZM calculations demonstrated here will 
have an impact on the global QCD analysis of PDFs, since the precision DIS data 
sets from both fixed-target and HERA cover the kinematic region highlighted in the 
above plots. Going from a ZM to a GM global analysis, the PDFs will undergo 
some re-alignment among themselves. And, for reasons described above, noticeable 
differences in the predictions for F^ are expected. 



4. New Global Analysis 

We now apply the improved implementation of the GM formalism to the global QCD 
analysis of the full HERA I DIS cross section data sets (cf. next subsection), along 
with fixed-target DIS, Drell-Yan (DY) data sets and Tevatron Run I inclusive jet 
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production data sets that were used in previous CTEQ global PDF studies, in order 
to obtain the most precisely determined parton distributions possible. 



4.1 Input to the Analysis 

Previous CTEQ global analyses of PDFs used structure function data for all available 
DIS experiments. By now, both HI and ZEUS experiments have published detailed 
cross section data from the HERA I runs (1994 - 2000) for both NC and CC processes. 
We are able to use these cross section data directly in the global analysis, so that 
the new analysis is free of the model-dependent assumptions that usually go into 
the extraction of structure functions. This is important since, in addition to the 
dominant ^2*^'^^, we can also gain model-independent information on the longitudinal 
and parity violating structure functions F2''^^ and F^^ from this more comprehensive 
study. For this new effort to yield more accurate PDFs, and to produce more rehable 
predictions on the various structure functions mentioned above, it is crucial to use 
the available correlated systematic errors in the global analysis, as we shall discuss 
below. 

The HERA I cross section data sets that are included in this analysis consist 
of the total inclusive NC and CC DIS processes, as well as the semi-inclusive DIS 
processes with tagged final state charm and bottom mesons. They are listed in Table 
I. 
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Table 1: HERA I table sets used in the global analysis. 



These data are supplemented by fixed-target and hadron collider data sets used 
in the previous CTEQ global fits: BCDMS, NMC, CCFR, E605 (DY), E866 proton- 
deuteron DY ratio, CDF T4^-lepton asymmetry, and CDF/DO inclusive jet produc- 
tion. Details and references to these can be found in Ref. [2]. We adopt the same Q- 
and W- cuts on experimental data as in Ref. [2]; and the stability of our results with 
respect to varying these cuts has been studied and reported in Ref. [29]. 
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4.2 Parametrization of non-perturbative initial PDFs 

The parametrization of the non-perturbative parton distribution functions is an im- 
portant aspect of global QCD analysis since the robustness and reliability of the 
resulting PDFs depends on a delicate balance between allowing enough flexibility in 
the functional forms adopted to represent the unknown physics on the one hand, and 
avoiding over-parametrization that exceeds the constraining power of the available 
experimental input on the other. 

For this round of analysis, we have carefully re-examined this issue, and tried 
a variety of functional forms, including exploring the number of degrees of freedom 
associated with each parton flavor that current experimental data can constrain. 
(A more detailed discussion is given in Sec.| on quantifying uncertainties). Results 
that are common to many reasonable choices of parametrization are considered trust- 
worthy; the most representative among the stable flts are then chosen as the new 
standards. This effort results in some minor streamlining of the parametrization used 
in previous CTEQ analyses [2] . Details are given in Appendix p. 

As with the previous analysis, we choose the input scale of /xq = 1-3 GeV (which 
is also the charm quark mass, Mc, used in our calculations). We assume the strange 
and anti-strange quarks are equal to each other (s = s), and are proportional to the 
non-strange sea combination {u + d) at /iq. We flnd that, within the current global 
analysis setup, the proportionality constant k (deflned as {s + s)/{u + d)) is only 
weakly constrained. For the purpose of the current analysis, we use the common 
value of K = 0.5 (at /iq = 1.3 GeV) that is well within the allowed range. Finally, 
as in all existing global analyses, we assume the c and h distributions to be zero at 
the scale corresponding to their masses, and are generated by QCD evolution above 
that. 

4.3 New Global Fits 

The new global fit with the improved theoretical calculation and more extensive DIS 
data sets results in even better agreement between theory and experiment than the 
previous fits of CTEQ6M/CTEQ6.1M [2], CTEQ6HQ [30], and CTEQ6AB [31]. « 
This is an important confirmation of the standard model (SM) in general, and the 
general mass PQCD formalism described in Sec. in particular. 

We choose a new central fit, designated as CTEQ6.5M, and 40 sets of eigen- 
vector PDFs that form an orthonormal basis characterizing estimated uncertainties 
in the parton parameter space according to the Hessian method described in [2, 32]. 
We shall describe the characteristics of the central fit in this section, and the full 
eigenvector sets in a later section on uncertainties (Sec.||). 

^In terms of the overall x^, used as a measure of the goodness-of-fit m our global analysis, the 
decrease is Ax^ ^ 200 for 2676 data points when the same new data sets are fitted with the ZM 
theory vs. the GM theory. This is a significant improvement. 
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The experimental data that have the most influence on the determination of the 
PDFs are, as is well known, the precision DIS total inclusive measurements, along 
with the DY experiments (sea quarks), W lepton asymmetry measurements (flavor 
differentiation), and collider inclusive jet production measurements (gluon). New to 
the current global analysis (in addition to the replacement of NC structure functions 
by the more complete cross section data) are the HERA CC total inclusive and NC 
tagged heavy flavor inclusive structure functions and cross sections.^ These new data 
sets are fit quite well in the new round of global analyses, demonstrating the con- 
sistency of the underlying PQCD framework. However, due to the limited statistics 
available for these processes, they do not provide any readily identifiable constraints 
within the confines of the general global analysis procedure. More dedicated studies, 
with targeted techniques, may be needed to uncover potential physical implications 
of these data and their HERA II successors. We shall not pursue these in this paper. 

As examples of the new fits to the HERA I cross section data, we show two plots 
comparing the HI and ZEUS 1999-2000 e+p NC reduced cross section data sets with 
the CTEQ6.5M fit. The x^/Npts (number of data points) for the two data sets 
are 169/147 and 94/90 respectively. The HI data set has 6 sources of correlated 
systematic errors {rj, ^ = 1, 6}. The optimal shifts of these errors for the CTEQ6.5M 
fit are {r^} = {0.378, -0.173, 0.413, 0.329, 0.544, -0.515}— all within one a, and 
Yl^=i "^1 = 1-012. (The definition of the parameters, as used in our treatment of 
correlated errors, can be found in Appendix B of [2].) The ZEUS data set has 8 
correlated systematic errors. The corresponding shifts are {rj} = {0.096, —0.110, 
-0.048, 0.385, -0.068, -0.651, -0.376, -1.334}— with only the last one being above 
1, and Yl^=i — 2.522. Both are quite reasonable. This pattern is typical for other 
HERA data sets. Details are available upon request to the authors. 

4.4 New Parton Distributions 

Since both the updates in theory and in experimental input in this global analysis 
represent incremental improvements over the previous CTEQ effort, rather than 
major modifications, we do not expect drastic shifts in the resulting PDFs. In the 
following discussions, we will focus on the few notable differences and their physical 
implications. 

To highlight the changes in the PDFs, we present the ratio of the new CTEQ6.5M 
distributions and the corresponding CTEQ6.1M ones, compared to the previously es- 
timated uncertainty bands of the latter. Figure | shows the d-quark, u-quark and 
gluon distributions at Q = 2 GeV. The CTEQ6.5M/CTEQ6.1M ratios are repre- 



^We remark, as mentioned in Sec. 2_^, the theoretical underpinning for the semi- inclusive heavy 
quark production process is less firm than for the total inclusive ones. But it does hold at order 
as — the order at which we perform this analysis. Incidentally, we found that the natural choice of 



scale for heavy flavor production cross section calculation mentioned in Sec. 2.6 leads to a noticeable 



but marginally significant, improvement in the overall global fit (compared to the default fi = Q). 
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HI e^pNC data (1999/00) - CTEQ6.5 PDFs 



ZEUS e+pNC data (1999/00) - CTEQ6.5 PDFs 




Figure 4: Comparison of the HI (left) and ZEUS (right) 1999-2000 e^p NC reduced cross 
section data sets with the CTEQ6.5M fit. 

sented by the solid curves. To illustrate the universal behavior of the new fits, we 
also include two dashed curves in each plot, representing equivalent good fits with 




Figure 5: CTEQ6.5M u, d and g distributions (solid curves) at scale /U = 2 GeV nor- 
malized to the corresponding ones from CTEQ6.1M. The shaded areas represent the esti- 
mated uncertainty band from the CTEQ6.1 analysis. The dashed curves represent alter- 
native, equally viable, candidates for this round of global analysis with slightly different 
parametrization forms than CTEQ6.5M. 
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alternative parametrizations mentioned earlier (second paragraph of Sec. [4.2|) . We 
do not show separate results on the sea and the valence distributions, since the u- 
and d-quark distributions are dominated by the former at small x, and the latter at 
large x — both visible in the existing plots. 

We see from Fig.|^ that the new PDFs are indeed generally within the previ- 
ously estimated uncertainty bands, demonstrating consistency with previous anal- 
yses. There is, however, a notable departure of the new quark distributions from 
the old ones in the region x ~ 10"'^. At the peaks of the ratio curves, the new 
distributions are up to 20% larger than CTEQ6.1M, and a factor of two outside the 
previously estimated error bands. This feature is shared by all choices of alternative 
parametrizations (and by all 40 sets of eigenvector PDFs to be discussed in the next 
section) . 

It is not surprising to see a shift in the extracted quark PDFs in the small-x 
and low-Q region, since this is where the theoretical treatment of quark mass effect 
matters (cf. Sec.|^, particularly the left plot of Fig.|^). To see that this shift is indeed 
caused by mass effects in the new calculation, we compare F2{x,Q'^ = 4 GeV^) 
calculated using CTEQ6.5M PDFs with the general-mass and the zero-mass Wilson 
coefficients, both normalized to the CTEQ6.1M result (which was obtained in the 
ZM formalism) in Fig.^ The difference between the two CTEQ6.5M calculations 




Figure 6: Comparison of theoretical calculations of F2 using CTEQ6.1M in the ZM 
formalism (horizontal line of 1.00), CTEQ6.5M in the GM formalism (solid curve), and 
CTEQ6.5M in the ZM formalism (dashed curve). 

is broadly in the small-x region, and it is of similar order of magnitude to that seen 
in Fig.^. Since the GM calculation (solid curve) yields lower values than the ZM 
calculation (dashed curve), the new CTEQ6.5M quark PDFs are pushed up in the 
new global analysis (cf. Fig. ^ in order to fit the same DIS data. The deviation of 
the GM CTEQ6.5M prediction is not far from the ZM CTEQ6.1M result (horizontal 
reference line, 1.00) in Fig. ^ since both are obtained by fits to the data. However, 
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they are not identical because there are several differences between the old and 
new global fits — DIS cross sections vs. structure functions as input, parametrization, 
etc. — in addition to the treatment of heavy quark mass effects. 

The heavy quark mass effects diminish with increasing . However, their effect 
on the analysis of experiments at low produces a change in the PDFs even at 
larger . Figure ^ shows the same comparisons at the scale = 25 GeV^. We 
see that the difference between CTEQ6.5M and CTEQ6.1M remains significant at 
this scale. Even at ~ 10^ GeV^ the CTEQ6.5 quark PDFs are higher than 
CTEQ6.1 by ~ 5 — 6% for x ~ 10~^ — approximately the CTEQ6.1 uncertainty at 
these (x, Q) values. This can result in significant increases in physical predictions 
on hadron collider cross sections that are sensitive to PDFs in this x range, e.g. for 
W/Z production at the LHC, the increase is ~ 8%, cf. Sec. ^. 

4.5 Mass effects, Low-Q^ HERA data, and Correlated systematic errors 

The noticeable changes in the quark distributions at low x and Q also suggest a closer 
examination of the comparison of the new theoretical predictions with the precision 
DIS data in this region, particularly because the longitudinal structure function is 
expected to play a substantive role in the understanding of the low x and Q HERA 
cross section data. 

Among the high precision HERA I data sets, the 1996-97 e^p NC reduced cross 
section measurements include data in the low x and Q region. We examine these 
data sets (from HI [15] and ZEUS [22] ) in a little more detail. The CTEQ6.5 fit to 
the HI data set has x^/Npts = 107/ 115; the shifts of the 5 correlated systematic 
errors are {r^} = {0.218, 1.361, -0.472, 0.374, 1.581} with J^rf = 4.763. The 
corresponding numbers for the ZEUS data set are x^/Npts = 279 / 227; the shifts 
of the 10 correlated systematic errors are {rj} = { — 1.575, —0.573, —1.407, —0.263, 
-0.025, -1.203, 1.278, 0.425, -0.258, 0.238}, with ^rf = 8.244. Aside from the 
somewhat high overall for the ZEUS data set (which was also seen in the previous 
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Figure 7 : Same as Fig. 0, except that the PDFs are at the scale ji = b GeV. 



- 18 - 



round of CTEQ6.1 analysis using the corresponding F2 data set), these numbers 
indicate reasonable fits. 

In Fig. we show the data from each experiment from the four lowest Q-bins 
that pass our Q > 2 GeV cut, normalized to the CTEQ6.5M fit. The comparison 
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Figure 8: Four low Q bins of the HI and ZEUS 1996-97 NC cross section data sets 
compared to the CTEQ6.5M fit. Open circles: the ratio data/theory; solid dots: the same 
data points shifted by correlated systematic errors. (See text.) 

with HI data (left plot) shows a pattern of "turnover" of experimental data points 
(open circles) at low x with respect to the theory for all four bins. (For given 
E and Q, low x corresponds to high y.) However, this discrepancy disappears when 
correlated systematic errors are included in the analysis, as seen from the fact that 
the solid dots (representing data corrected by systematic errors) fit rather well the 
theory predictions (the horizontal lines corresponding to 1.00 for the ratio). The 
values for the overall of this fit as well as the systematic shifts given in the 
previous paragraph support this observation. 

The comparison of the ZEUS data to the CTEQ6.5M fit (right plot) does not 
show the same systematic low-x turnover. Instead, data (open circles) in the three 
higher Q bins are generally below the theory prediction. Again, we see that the 
differences between the two go away when correlated systematic errors are included in 
the analysis (solid dots). We get acceptable fits in all bins with reasonable systematic 
shifts. 

The low-Q and high-y HERA data has been the subject of special analyses by 
both HI and ZEUS collaborations, mainly in the context of extracting the longi- 
tudinal structure function Fl{x,Q). Results of QCD fits performed in this regard 
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[15,33], appear to be similar to those shown above. Detailed comparison, however, 
is not possible at present since details of the theoretical input to the HERA analyses 
(e.g. issues related to mass effects described in Sec.||) has not been specified. As we 
have shown, heavy quark mass effects are important in this kinematic region. It 
would be useful for all future analyses to include the mass effects. 

The observed low-x turnover of the HI data has been considered a potential 
problem for global analyses by the MRST group [34,35]. This difficulty does not 
seem to arise in our analysis. Two possible sources could be responsible for this 
difference: (i) although both analyses include quark mass effects, the implementa- 
tions are not the same (cf. [10], compared to Sec.||); (ii) our inclusion of correlated 
systematic errors in the analysis is responsible for bringing theory and experiment 
into agreement, as demonstrated in Fig.||. A more detailed study of these issues is 
clearly called for. 

5. Uncertainties on New Parton Distributions 

Using the new theory implementation and experimental input, we have also per- 
formed a detailed study of the uncertainties of the PDFs, following the Hessian 
method of [2,32,36,37]. This involves finding a set of eigenvector PDFs that char- 
acterize the uncertainties of the parton distributions around the "best fit" in the 
parton parameter space. 

To ensure that this procedure will yield meaningful results, we first carry out a 
series of studies to match our fitting parametrizations with theoretical and exper- 
imental constraints: (i) first we make sure that the best fit is robust by checking 
that the quality of the global fit cannot be significantly improved by increasing the 
number of free parameters or changing the functional forms; (ii) next, we identify 
"fiat directions" in the resulting parameter space (representing degrees of freedom 
that are not constrained by current experimental input) by diagonalizing the Hessian 
matrix and examining its variation along the eigenvector directions; (iii) using that 
information, we freeze an appropriate subset of parameters and re-diagonalize the 
Hessian matrix, which characterizes the quadratic dependence of the fitting 

parameters in the neighborhood of the minimum. The number of eigenvectors that 
can reasonably be determined is around 20 — the same as was used in the CTEQ6.1 
analysis. 

To arrive at a quantitative estimate of the range of uncertainties in the parton 
parameter space, we examine the global Xgiobai ^^^^ used by the fitting program 
as a measure of the overall "goodness-of-fit" , as the parameters are varied in the 
neighborhood of the global minimum. In defining Xgiobai' include weight factors for 
a few experiments that have only a small number of data points. The eigenvectors and 
the weights are arrived at by the iterative method of Ref. [36] so that an adequate fit 
to every data set is maintained as far out as possible along the eigenvector directions. 
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This allows us to estimate a confidence range by the condition that, within this range, 
the fit to every data set is within its 90% confidence level. 

In this way we generate the final eigenvector PDF sets so that they span the 90% 
confidence range for the contributing data sets. The procedure is similar to that for- 
mulated in Refs. [2,32,37] (which contains more details). There are 40 eigenvector 
sets, corresponding to displacements from the central fit in the "+" or "— " senses 
along each of the eigenvector directions. ^'^ PDF uncertainty limits for a 90% confi- 
dence range for physical predictions can be calculated from these sets by computing 
the prediction for each of the 40 sets, and adding the approximately 20 upward 
(downward) deviations in quadrature to obtain the upper (lower) limit. 
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Figure 9: CTEQ6.5 PDF uncertainty bands and eigenvector PDF sets. 

Figure ^ shows the uncertainty bands determined by this method for u, d, and g 
PDFs at the scale = 4 GeV^. The lines represent each of the 40 sets of eigenvector 
PDFs normalized to the central fit. The upper (lower) edges of the shaded uncer- 
tainty region is obtained by adding in quadrature the contributions from eigenvector 
sets that lie above (below) the central fit at each particular x. We observe that, in 
some cases (such as for the gluon at large x), the uncertainty band is dominated 
by a single pair of eigenvector PDFs, corresponding to the two senses along a single 
eigenvector direction. 



Figure |T0| shows the same uncertainty bands as above, together with curves 
that show the fractional uncertainty range from CTEQ6.1 that was determined by 
a similar procedure. This shows that the two estimates of PDF uncertainties are 
broadly comparable with each other, with a slight tightening of the uncertainty 
ranges of d quark and gluon distributions in certain x regions. Of more interest is a 
comparison of estimated uncertainties of physical predictions. We shall discuss these 
in Sec. ^ below. 

^"The uncertainty bands are not always symmetric in the +/- directions since we independently 
generated + /- sets along each eigenvector direction in order to provide somewhat better approxima- 
tion to the uncertainties along the flatter directions, where there are deviations from the quadratic 
approximation. 
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Figure 10: CTEQ6.5 PDF uncertainty bands compared to those of CTEQ6.1. 



Because of the improved theoretical and experimental input to this new global 
analysis, as well as the much more thorough study of its parametrization dependence, 
we now have greater confidence in these uncertainty estimates than before. Wider 
applications of the new results to Standard Model and New Physics processes at 
hadron colliders will be pursued. 



To compare some currently used PDFs to that of CTEQ6.5, Figure O shows 
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Figure 11: Same CTEQ6.5 PDF uncertainty bands compared to: (i) the CTEQ6.1M 
(green long-dash line), CTEQ6A118 (blue short-dash) and MRST04 (black dash-dotted) 
PDFs; and (ii) upper and lower edges (red solid lines) of similar uncertainty bands generated 
with reduced number of fitting parameters (see text). 



CTEQ6.1M (dashed curves), CTEQ6A118 (central fit of the CTEQ6 "a, series" 
[31]), and MRST04 [3] PDFs as ratios to CTEQ6.5M. We note that the previous 
CTEQ PDFs lie mainly within the new uncertainty bands — except at x ~ 10~^ for 
the reasons discussed in the various subsections of Sec. ^ The MRST04 quark PDFs 
are closer to CTEQ6.5 in the x ~ IQ-^ region than CTEQ6.1M and CTEQ6AB, 
presumably because both CTEQ6.5 and MRST04 include mass effects (albeit using 
different prescriptions) while the other two are in the ZM formalism. The MRST04 
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gluon PDFs are within the CTEQ6.5 uncertainty band at large-x; but they are 
outside the band around x = 0.3 and at small x. 

As mentioned earlier, as a part of our investigation on the robustness of our 
results, we have performed uncertainty analysis using different number of variable 
parameters. (Other global analysis groups have always used fewer variables.) We 
found the resulting ranges of uncertainty stable if this number is greater than 16. 
To show how the results may be affected by too restrictive choices of parameters. 
Figure |TT] also include two curves (red) that correspond to the edges of the uncer- 
tainty bands using only 11 free parameters. We see that these bands are considerably 
narrower than the stable results represented by the CTEQ6.5 bands. In other words, 
over-restricting the degrees of freedom in the input parametrization at /io can signif- 
icantly overestimate how well the PDFs are measured. 



6. Implications for Hadron Collider Physics 

The new PDFs have significant implications for hadron collider phenomenology at 
the Tevatron and the LHC We shall mention one example here: the benchmark W 
production total cross section aw- 

The higher quark distributions in CTEQ6.5 for x ~ 10-3 lead to an mcrease m 
the predicted values for aw over those based on CTEQ6.1: for Tevatron Run II, we 
get ^aw /crw = 3.5%; and for the LHC, ^aw /c^w = The significant increase 

of the predicted aw at the LHC reflects the fact that it is directly dependent on 
PDFs in the region x ~ 10"^. 

It is also interesting to compare the uncertainties of these predictions as es- 
timated by the Hessian method. For the Tevatron, we find this uncertainty to be 
+3.1/-3.0% for CTEQ6.5, compared to +3.8/-4.4% estimated using CTEQ6.1. For 
the LHC, the uncertainty is +4.9/-4.1% for CTEQ6.5, compared to +5.2/-5.9% 
for CTEQ6.1. Thus there is a notable reduction in the uncertainty ranges. This is 
perhaps related to a better determination of the correlations between different parton 
flavors, in addition to the obvious connection to the uncertainties of the individual 



PDFs (which are rather comparable for CTEQ6.1 and CTEQ6.5) shown in Figure |T0 
Detailed results on W/Z production, including differential distributions, and 
implications of these new PDFs on other SM and New Physics processes for hadron 
colliders will be explored in a separate study. 



7. Summary and Outlook 

We have updated both the theoretical and experimental input to the CTEQ global 
QCD analysis in this work. On the theory side, we have used a newly implemented 

^^The predicted values for the total W production cross section using CTEQ6.5M is 24.7 nb at 
the Tevatron II, and 202 nb at the LHC. 
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systematic approach to PQCD, including heavy quark mass effects according to the 
general formahsm of Collins [8] . Experimentally, we have used all available precision 
HERA I cross section data sets, along with well-established fixed target and hadron 
collider experimental data. Correlated systematic errors, whenever available, are 
fully incorporated in the analysis. In performing this analysis, we have extensively 
studied the effects due to changes of the functional form for the initial parton dis- 
tributions, and to changes in the number of free parameters allowed in the fits, in 
order to arrive at stable and physically meaningful results. 

One noticeable common feature of the new parton distributions, compared to 
the previous ones, is the change in the u and d quark distributions around x ~ 
10~^ at low Q. This results from the inclusion of mass effects in the theory, in 
conjunction with the improved (model independent) HERA cross section data, over 
the previous analysis (using F2 data that inevitably involve assumption about for 
their extraction). 

Within the general-mass PQCD framework, we find broad consistency between 
the extensive data sets incorporated in this analysis. This permits us to arrive 
at 90% confidence level estimates of the uncertainties of PDFs around the chosen 
central fit, CTEQ6.5M. These uncertainties are encapsulated in 40 sets of eigenvector 
PDFs that span the neighborhood of the central fit in the parton parameter space. 
We found the new uncertainty bands of the PDFs are shghtly narrower than the 
previous ones, but are generally of the same order of magnitude. 

While progress towards better-determined PDFs in a more precisely formulated 
PQCD framework has clearly been made, it is worthwhile to mention some of the 
limitations of current global analysis of PDFs that call for continued advances in both 
theory and experiment. First, due to rather weak existing experimental constraints 
of the strange quarks, we have assumed in this work that {s + s) is of the same shape 
as (m + d) and fixed the proportionality constant k at the initial scale Qo during 
the fit. In principle, better constraints on k, and on the possible difference between 
s and s, can come from recent neutrino scattering experiments by NuTeV [38] and 
Chorus [39]. However, many open questions pertaining to the consistency between 
existing experimental data sets (including those from the "old" experiments CDHSW 
and CCFR), nuclear target corrections, and other issues make the use of these data 
controversial at present. A dedicated study on the strangeness sector, designed to 
delineate the range of uncertainties of both s{x) and s{x), will require a somewhat 
different tactic, focused on the most relevant degrees of freedom. In the same vein. 

Given the faet that several individual experimental data sets show much larger fluctuations than 
expected from normal statistics, and that problems of statistical compatibility between different 
experiments of the same type are not uncommon, it is well-known that this complex system is too 
un-textbook-like to be amenable to strict "1 <j error'' analysis of the PDF parameters. As mentioned 
in the main body of this paper, we generally apply 90% confidence level "goodness-of-fit" criteria 
in our uncertainty studies. 
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we have assumed that all heavy quark partons (c and b) are generated radiatively by 
QCD evolution (mainly gluon splitting) at the initial scale Qo- This assumption is 
not well defined quantitatively, because it depends on the choice of Qq. Furthermore 
the assumption itself may be questioned — does intrinsic charm exist in the proton? 
The charm and bottom degrees of freedom can be investigated phenomenologically 
within the general-mass PQCD framework described here, and is currently under 
investigation [40]. 

There has been considerable discussion in recent literature [41] about extending 
global QCD analysis to higher orders in as- This interest is spurred, on the one hand, 
by the final availability of the NNLO evolution kernel [42], and on the other hand, 
by the calculation of NNLO Wilson coefficients for various hard scattering processes, 
such as Higgs production [43]. In the global analysis of PDFs, NNLO becomes im- 
portant, and, by implication, necessary to include, when the theoretical corrections 
to NLO calculations are comparable to the errors on the corresponding input experi- 
mental data. For DIS and DY processes used in current global analysis, this happens 
only in the small-x region [42]. However, near boundaries of the kinematic region, 
such as small x, the relatively large corrections are generally associated with higher 
powers of large logarithms in PQCD. These are symptoms of the breakdown of the 
fixed-order perturbation expansion and the need to resum these logarithms in order 
to achieve stable and reliable results. Thus, efforts to expand global QCD analysis 
to higher orders must go hand-in-hand with work to incorporate resummation effects 
(not only confined to small x) in the theoretical framework. The importance of 
much development work along both lines is evident. 

The new CTEQ6.5 PDF sets, including the eigenvector sets discussed in Sec. |], 
will be made available at the CTEQ web site (http://cteq.org/) and through the 
LHAPDF system (http:/ /hepforge.cedar.ac.uk/lhapdf/). Because the CM formalism 
used in this analysis represents a better approximation to QCD, compared to the 
previously used ones, the new PDFs are expected to be closer to the true values than 
previous ones. In physical applications at energy scales much larger than Mc and 
Mb, these PDFs can be convoluted with commonly available hard-scattering cross 
sections calculated in the ZM formalism to obtain reliable predictions, because quark 
mass effects in the Wilson coefficients will be negligible. However, for quantitative 
comparison of a theoretical calculation to precision DIS data, in the region where 
Mc/Q and M^/Q are not very small, the CM formalism described in Sec.^ must be 
applied to the hard scattering cross section together with CTEQ6.5 PDFs in order 
to obtain accurate results. 



-'^^Much recent progress on small- a; resummation, and its possible application the global QCD 
analysis, has been reported at the DIS2006 Workshop, and reviewed in [44]. Other types of re- 
summation, such as transverse momentum and threshold resummations, have also come to the fore 
because of their importance for LHC phenomenology. 
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A. Rescaling 

We give here an intuitive derivation of the rescaling prescription (ACOTx) in the 
context of NC charm production. The general idea is applicable to other heavy flavor 
production processes, as will become clear later. As mass effects are only relevant 
at energy scales comparable to the heavy quark mass (charm in our case), we will 
focus in this region where the physically dominant mechanism for charm production 
is the gluon fusion process 7*5' — > cc. The graphical representation of this process is 
shown as the first term in Fig.|T^. The analytic expression for this contribution to 




Figure 12: Contributions to charm production in NC DIS scattering and the physical 
origin of the rescaling variable in the PQCD approach. 

the physical structure function is 

f-g{z,i^H^,^A-.Q.M,) (A.l) 

where a^w^.^^^g is the order partonic cross section, g{z, fi) is the gluon distribution 
function. The lower limit of the convolution integral is determined by the boundary 
of the final state phase space integration; it is 

X. = x(l + 4M2/g2) (A.2) 

with X being the Bjorken x. We recognize that this is just the rescaling variable 
Eq. (|2.6|) of Sec. p.4[ For the gluon-fusion subprocess, it follows strictly from kine- 
matics, once we include the mass of the heavy quarks into consideration. 
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The gluon fusion term by itself is not infrared safe at high energies. In this 
hmit, the infrared unsafe part comes from the colhnear configuration in the final 
state phase space integration. The singular part of uj^g^^^ is of the form 



, Q, M,) ^ ln(^)P,^,(;.) (A.3) 



where w^.^^c represents the order a° process 7*c c (upper vertex of vector-boson 
coupling to quark); and Pg^dz) is the g ^ c splitting function. This collinear 
singularity of the '~f*g — >■ cc contribution can be removed by a subtraction term of 
the form 

ln(|-) ^9{z, /x)P,->e(^) (A.4) 

with the introduction of the factorization scale fi (generally chosen to be of order 
Q). This term is represented by the second graph in Fig. where the mark on the 
internal charm parton line signifies that its momentum k is collinear to the gluon 
and ^ 0. For the purpose of cancelling the collinear singularity of the gluon-fusion 
term at high energies (the Bjorken limit), the variable ( in this expression can be 
any expression provided ( ^ x in that limit. In fact, conventionally, it is taken to 
be just X. The trouble with this choice is that in the other limit — near the threshold 
region where W and Q are of the order of — the subtraction term knows nothing 
about the kinematics of cc pair production, hence bears no relation to the physical 
structure function. The combined result of Eqs. ( [A.l| ) and ( [A.4| ) is then completely 
artificial, hence unphysical, in this region. To remedy this problem, one only needs 
to realize the origin of the subtraction term, and make the obvious choice 

( = Xc = x{l + 4M'JQ') (A.5) 



that is appropriate for the parent gluon-fusion contribution, Eq. (|A.1| ). With this 



choice of the scaling variable (, the subtraction term, Eq. ([A.4|) behaves correctly 
both in the high-energy and the low-energy regions. 



To complete the derivation, we need to turn to the third diagram of Fig. |T2 
which represents the simple order-a^ 7*c c parton process in the 4-fiavor scheme. 
From the perspective of the preceding discussion, this term arises from resumming 
the collinear and soft singularities to all orders in the perturbation expansion. The 



leading term in this expansion is given by Eq. ( [A. 41 ) above (with a positive sign). The 



resummed result, as illustrated by the diagram, is just 

c(C,/xX,^, (A.6) 

Here, the choice of the scaling variable ( should be dictated by similar considerations 
as above: at high energies, ( must reduce to the Bjorken x; and in the threshold 
region, the combined contribution from Eq. ( [A.6| ) and the subtraction term, Eq. ([A.4| ), 



-27- 



must be of higher order in ag- The naive (and common) choice C, = x satisfies the 
first criterion, but not the second. For the same reason discussed before, the choice 
C = Xc5 Eq. ( |A.5| ), satisfies both; hence it is the physically sensible one to use. 

With the use of the rescaling variable C = Xc for the LO 7*c c term and the 
subtraction term, the sum of the three contributions in Fig.|l2| reduces, by defini- 
tion, to the gluon fusion contribution in the threshold region, as it should; and it 
approaches the conventional zero-mass PQCD form, LO (7*0 c) + (NLO correc- 
tion), in the high energy limit, as it should. From this perspective, one sees clearly 
the dual role of the subtraction term: in the threshold region, it overlaps substan- 
tially with the LO (7*0 c) contribution to make the gluon fusion subprocess the 
primary production mechanism; and in the high energy limit, it overlaps with the 
singular part of the '-j*g cc contribution, and thus helps to render the combined 
order terms infrared safe (and yield the true NLO correction to the perturbative 
expansion) . 



B. Parametrization 

The parametrization of the parton distributions at /io that was used to obtain the 
CTEQ5 and CTEQ6 parton distributions contained 5 shape parameters (apart from 
normalization) for each fiavor. However, the global analysis data sets were not suffi- 
ciently constraining to determine all of these parameters, so a number of them were 
frozen at some particular values. 

In the current effort to match the theoretical parametrization with experimental 
constraints, we achieve the same goal by adopting a simpler form with 4 shape 
parameters for the valence quarks Ut,(x), dv{x), and the gluon g{x)\ 

/(x) = aox"^(l-a;)"^e"^" + "^"' . (B.l) 

This can be seen as a plausible generalization of the conventional minimal form 

h{x) = aox''^{l-xr^ , (B.2) 

which combines Regge behavior at x ^ and spectator counting behavior at x ^ 1 
in an economical way. 

Both functions, ( [B.l| ) and ( [B.2| ), are conveniently positive definite. The following 
modified logarithmic derivatives of these functions are simple polynomials in x, 

0o(x) = —X (1 — x) = — ai + bix , (B.3) 

ax 

and 

d 111 f 

(j)(x) = -X (1 - x) = -ai + bix + b2x'^ + bsx^ , (B.4) 

ax 
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where the coefficients {bi} are simple hnear combinations of the original ones {aj} 
in the exponent of Eq. ( |B.1| ). So the form Eq. ([B.l| ) corresponds to generalizing the 
logarithmic derivative from a linear function (f)o{x) to a cubic polynomial (f){x), and 
follows in the spirit of using polynomials to approximate unknown functions that 
have no known singularities. The only practical question is whether this polynomial 
generalization contains enough flexibility to represent the physical PDFs that we 
are trying to determine. Our investigation indicates that this is the case, since 
significantly better fits cannot be achieved by introducing additional parameters or 
by changing the functional forms. 

We continue to use the same parametrizations for u, d that were used in CTEQ6. 
As mentioned in the text (Sec. [4.2|) , we continue to use the approximation s{x) = 
s{x) oc u{x) + d{x). The full set of formulas for the initial PDFs at /iq = 1.3 GeV is 

u,{x), d,{x),g{x) = Ao x^'-^ (1 - a;)^^ ^-Asii-xf + a,x^ ^g ^^ 

u{x) + d{x) = iAoX^i-i(l-x)^2e^3x(i^^gA4)A5 (Bg) 

d{x)/u{x) = e^'x'^^-^il-x)^' + {1 + A4X) {1 - x)^' (B.7) 

s{x) = s{x) = Ik{u{x) + d{x)) (B.8) 

c{x) = c{x) = b{x) = b{x) = (B.9) 

(Notes regarding these formulas: the parameter Ai is shifted from the ai above to 
make it correspond in definition to the standard Regge intercept; the parameters 03 
and 04 are replaced by linear combinations ^3 and A4 to reduce their correlation in 
the fitting by making them control behavior at large and small x; the parameter ^4 
in u + d is defined using an exponential form to ensure positivity of the PDFs.) 

For concreteness, we give in Table II the coefficients that correspond to the 
central fit CTEQ6.5M. The value of k is 0.5; and the strong coupling constant as{Mz) 
is 0.118. We use = 1.3 GeV, Mb = 4.5 GeV. 

^0 Ai A2 A3 Aj A5 

5.76388 0.64416 2.31531 0.74528 -2.14868 
4 3.65786 0.62677 3.31531 0.87977 -2.37338 
g 0.09974 0.22853 4.00000 -4.23974 8.64169 
d + u 0.29563 -0.22165 12.09400 6.46763 4.51075 0.26278 
d/u 11.49257 5.64186 17.00000 19.41872 9.45863 
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